Automatic generation of data models and accompanying user interfaces

ABSTRACT

Techniques to generate data models for an item master having a number of items. Each item is associated with a number of attributes and each attribute is associated with a set of values. In one method, the items in the item master are initially classified into a number of pagesets, with each pageset being defined by a unique combination of values for a first set of (classification) attributes. For each pageset, a second set of (selectable) attributes is determined to uniquely identify the items in the pageset. The selectable attributes may be selected from a list of candidate attributes, which may include mandatory attributes designated to be used as selectable attributes and optional attributes that may be selected for use. Data models are generated for each pageset based in part on the selectable attributes, and include a set of tables descriptive of the items in the pageset.

BACKGROUND OF THE INVENTION

[0001] The present invention relates generally to computer processing, and more particularly to techniques for generating data models and user interfaces for catalog-type applications.

[0002] For some business enterprises, a large number of products or items may need to be organized and categorized for presentation in a clear and logical manner, such as with a catalog. For example, a retailer or a distributor may carry a large number of items in its inventory. These items may then be categorized into a number of groups (e.g., hundreds or thousands of groups) of related items. Each group may include one or more items and may be represented with a “pageset”.

[0003] Catalog-type applications such as the one described above are typically characterized by a large number of relatively simple items. These items may be associated with various attributes used to identify and describe the items. If the items can be sufficiently described and uniquely identified based solely on their attribute values, then the attributes may be used to classify the items into groups and to further identify the items in each group.

[0004] Each group of items may be represented with “data models” that describe the items in the group. These data models are typically of a particular defined format or schema and include sufficient information such that they may be used to generate user interface (UI) elements, such as frames or screens for a catalog. Items in each group may then be clearly and logically presented via these UI elements. For catalog-type applications, the data models tend to be similar from group to group (i.e., pageset to pageset).

[0005] Catalog-type applications tend to be large, with many items, and the task of organizing and classifying the items becomes more challenging as the number of items increases. However, catalog-type applications also tend to be repetitive, which affords the use of similar data models for representing the groups of items. Techniques that can be used to automatically generate data models and user interfaces for catalog-type applications are thus highly desirable.

SUMMARY OF THE INVENTION

[0006] The invention provides techniques to automatically generate data models from an “item master” (e.g., a master table) that includes a number of items. A set of classification attributes is initially provided (e.g., by an administrator via a user interface screen or automatically generated) and used to classify the items in the item master into pagesets. Data models may then be automatically generated for each pageset based in part on a set of candidate attributes (which may also be provided by the administrator via the user interface screen or in a configuration file). The data models are thereafter used to generate user interface (UI) elements, which can present the items in each pageset in a clear and logical manner. Various implementations of the invention are possible, some of which are described below.

[0007] A specific embodiment of the invention provides a method for generating data models for an item master having a number of items. Each item in the item master is associated with a number of attributes and each attribute is associated with a respective set of possible values. In accordance with the method, the items in the item master are initially classified into a number of pagesets. Each pageset is defined by a unique combination of values for a first set of attributes (referred to as classification attributes). A second set of attributes (referred to as selectable attributes) is then determined for each pageset, with the selectable attributes being used to uniquely identify the items in the pageset. Data models are then generated for each pageset based in part on the selectable attributes. In one implementation, the data models include a set of tables descriptive of the items in the pageset.

[0008] The classification attributes may be specified (e.g., by an administrator) via configuration variables. The selectable attributes may be selected from a list of candidate attributes, which may include mandatory and optional attributes. Mandatory attributes are designated to be used as selectable attributes. Optional attributes may be specified in an ordered list and may be selected for use as selectable attributes based on their order in the list. Each pageset includes a sufficient (e.g., minimum) number of attributes such that the items in the pageset are uniquely identified by their selectable attribute values.

[0009] The data models for each pageset may include a number of feature tables and configuration tables. One feature table is provided for each selectable attribute and includes a mapping of codes to descriptions corresponding to all possible attribute values. The configuration tables identify valid and invalid configurations for the pageset. Invalid configurations may be associated with a number of types of exception messages.

[0010] Output files (e.g., UI elements) are generated for the pagesets based on the data models. These output files may include input files for selectable attributes and results files for other attributes associated with items in the pageset. A contents list file is also provided and includes application-specific (as opposed to pageset-specific) data used to provide a navigation mechanism for the generated pagesets. The output files may be provided as XML documents, HTML files, or in some other format.

[0011] Prior to generating the data models, the item master and/or configuration variables may be validated, and error messages may be generated (and provided in a log file) for errors resulting from the validation process. The error messages may be used to “clean up” the item master and/or configuration variables, and the validation process may be iterated any number of times until valid data is obtained.

[0012] Another specific embodiment of the invention provides a method for forming a list of attributes for identifying items in a pageset. In accordance with the method, an attribute not yet considered for identifying the items in the pageset is initially selected. A determination is then made whether the selected attribute is useful for identifying the items in the pageset. If the attribute is useful, then it is included in the list. One or more additional attributes are then evaluated in a similar manner, one attribute at a time, until a sufficient number of attributes is included in the list such that the items in the pageset are uniquely identified by their values for the attributes in the list. In one embodiment, only attributes that are common for all items in the pageset are considered for evaluation.

[0013] The invention further provides other methods, computer program products, and systems capable of implementing various aspects, embodiments, and features of the invention, as described in further detail below.

[0014] The foregoing, together with other aspects of this invention, will become more apparent when referring to the following specification, claims, and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] FIG. 1 shows an example of an item master that lists the available items for an enterprise;

[0016] FIG. 2 is a diagram of an embodiment of a system capable of automatically generating data models for an item master;

[0017] FIG. 3 shows various tables that may be generated for configuration-type data models, in accordance with an embodiment of the invention;

[0018] FIG. 4A is a flow diagram of an embodiment of a process performed by a data builder module to generate a set of intermediate data files from the item master;

[0019] FIGS. 4B and 4C are flow diagrams of two embodiments of a process to determine a list of selectable attributes that may be used to uniquely identify the items in each pageset;

[0020] FIG. 5A is a flow diagram of an embodiment of a process performed by a model builder module to generate data models for each pageset;

[0021] FIG. 5B is a flow diagram of an embodiment of a process to examine the data for each pageset to generate exception messages for invalid configurations;

[0022] FIG. 6 shows an embodiment of a screen capable of presenting items in the item master using application files generated from the data models;

[0023] FIG. 7 is a diagram of another embodiment of a system capable of automatically generating data models using items stored in a repository; and

[0024] FIG. 8 is a block diagram of a computer system.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

[0025] FIG. 1 shows an example of an item master 100 that includes a collection of items (e.g., products) for an enterprise. Item master 100 (which may be implemented as a table) includes a number of rows and columns, with the specific number of rows and columns being dependent on the type and quantity of items being represented by the item master. The first row in the item master contains column headers, which identify the various attributes for the items in the table. Each subsequent row of the item master represents one record for one item. Each record includes information for the associated item, or more specifically the values for the attributes identified by the column headings. Thus, each column may be used to represent one specific attribute of the items, and each row may be used to represent one item.

[0026] In general, item master 100 may include information for any type of items that have attributes and which may be offered in a catalog context. For example, the items in the item master may represent products, services, solutions sets, employee relationship management (ERM)-based entities such as benefit documents, and other types of items. The item master may also be generated in various manners. In one embodiment, the item master may be generated manually (via key entry) and/or automatically (via a defined process) and is provided in a single data file. In another embodiment, the item master is generated from smaller tables in a relational database (i.e., a repository).

[0027] In the specific example shown in FIG. 1, item master 100 includes product data for a clothing enterprise. The first row in the item master contains column headers for the following attributes: ID, Gender, Type, Style, Size, Color, Price, and Item Number. Each subsequent row includes a record for one item and includes a set of values for the attributes identified by the column headers. For clarity, various aspects and embodiments of the invention are described for the example item master 100 shown in FIG. 1.

[0028] The item master may include a large listing for many items (e.g., hundreds or thousands of items). Presentation of the item master in the form shown in FIG. 1 may be cumbersome and unintelligible to an end-user (i.e., a user of a catalog application). For better presentation, the items may be classified into groups or “pagesets”. Each pageset may be defined by a specific set of attribute values and may be viewed as corresponding to a particular product family. Each pageset may include one or more items having a first set of attribute values that match those used to define the pageset. In fact, this first set of attribute values is used to categorize the items in the item master into their proper pagesets. The items in each pageset are also associated with a second set of attribute values that may be used to uniquely identify the items in the pageset. Since all items in a given pageset have the same set of values for the attributes in the first set, “uniqueness” for the items in the pageset is achieved if each item in the pageset has a unique set of values for the second set of attributes (i.e., the set of values for the attributes in the second set for each item in the pageset is different from the sets of attribute values for all other items in the pageset). Uniqueness is described in further detail below.

[0029] In an embodiment and as shown in FIG. 1, item master 100 is defined to include an identifier (ID) column 110, one or more Classification columns 120, one or more Candidate Attribute columns 130, and one or more Data Attribute columns 140. ID column 110 lists an item-specific identifier (e.g., an ID, SKU, or row ID) for each item in the item master. This identifier may not be unique for all items in the item master, but is unique for all items in any given pageset. The unique values for the identifier may be used to uniquely identify the items in each pageset. Classification columns 120 correspond to attributes used to classify the items in the item master into pagesets. Each pageset is defined by a unique set of classification column values and includes one or more items having the same set of attribute values used to classify the pageset. Candidate Attribute columns 130 correspond to attributes that may be selected and used to uniquely identify items within each pageset. The names of the Classification and Candidate Attribute columns generally conform to defined naming conventions for tables. Data Attribute columns 140 correspond to additional attributes of the items in the item master. These data attributes are generally used to provide additional descriptive information for the items but are typically not used to identify the items in the pageset.

[0030] In the example shown in FIG. 1, the Classification columns include the Gender and Type columns, the Candidate Attribute columns include the Style, Size, and Color columns, and the Data Attribute columns include the Description (Desc), Price, Item Number, and Image columns.
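The column roles described above may be illustrated with a minimal sketch. The sample rows, the CSV layout, and the names ITEM_MASTER_CSV, COLUMN_ROLES, and load_item_master are assumptions made for illustration only; they do not reproduce the actual contents of FIG. 1.

```python
import csv
import io

# Hypothetical sample rows; the actual contents of FIG. 1 are not reproduced here.
ITEM_MASTER_CSV = """ID,Gender,Type,Style,Size,Color,Desc,Price,Item Number,Image
1,Woman,Pants,Dress,8,Black,Dress pant,49.00,WP-1001,wp1001.gif
2,Woman,Pants,Casual,10,Khaki,Casual pant,39.00,WP-1002,wp1002.gif
3,Man,Shirt,Casual,M,Blue,Casual shirt,29.00,MS-2001,ms2001.gif
"""

# Column roles as described for the example item master.
COLUMN_ROLES = {
    "id": ["ID"],
    "classification": ["Gender", "Type"],
    "candidate": ["Style", "Size", "Color"],
    "data": ["Desc", "Price", "Item Number", "Image"],
}


def load_item_master(text):
    """Parse the item master: the first row holds headers, each later row is one item."""
    return list(csv.DictReader(io.StringIO(text)))


items = load_item_master(ITEM_MASTER_CSV)
```

Each element of items is then a mapping from column header to attribute value for one item, which is the shape assumed by the other sketches in this description.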

[0031] Various implementations may be used to classify items in an item master, generate data models, and further generate UI elements (i.e., run-time applications) based on the data models. In one family of implementations (referred to as “file-based”), preparation of the item master and specification of the attributes are mandatory steps (requiring interaction with an administrator, who may be tasked with building the catalog application), and run-time applications (e.g., in HTML files) are generated based on the data models (e.g., using XSLT). In another family of implementations (referred to as “repository-based”), preparation of the item master and specification of the attributes are optional steps and the data models are generated and saved to a repository. A publisher module (described below) may then be used to process (and possibly modify) the data models to generate the run-time application. A specific design for each of these two families of implementations is described below in FIGS. 2 and 7, respectively. Various other implementations can also be contemplated and are within the scope of the invention.

[0032] FIG. 2 is a diagram of a system 200 capable of automatically generating data models for an item master, in accordance with an embodiment of the invention. System 200 is an example of a file-based design, and is implemented as a software program that takes an item master as input and can generate data-dependent components of a catalog-type application. In this embodiment, system 200 (which is also referred to as a “catalog builder”) includes a data builder module 210, a model builder module 220, and a file builder module 230.

[0033] Data builder module 210 receives the item master and a first set of configuration variables, validates the data in the item master, classifies items in the item master into pagesets, identifies which attributes are to be used to identify the items in each pageset, and provides a set of intermediate data files. Data builder module 210 further provides status information indicative of the results of the processing on the item master and log information indicative of “uncleanliness” (i.e., errors) in the item master and/or errors in the configuration variables. The log information may be used to modify the configuration variables and/or the data in the item master (e.g., in an iterative manner) to provide valid data and variables.

[0034] Model builder module 220 receives the intermediate data files and a second set of configuration variables and generates data models. The data models may be provided in various forms such as, for example, XML documents, HTML files, formatted files or database tables that may be stored in a repository 250, and others. The XML documents contain pageset-specific data including the representation of the data models. A callout process may be inserted at a callout point in model builder module 220 and used to process and possibly modify (e.g., the XML version of) the data models before they are generated in final form, as described below.

[0035] File builder module 230 receives the data models (which may be provided in XML documents) and a third set of configuration variables and generates data-dependent application files. These application files either include or may be used to generate UI elements suitable for representing a catalog of the item master, as described below.

[0036] Each module executes on one or more input files and provides a set of output files. In an embodiment, user preferences for the operation of each module, such as directions for interpreting the item master, output format options, directories of the input and output files, and so on, are provided as configuration variables. The configuration variables for all three modules may be provided in various forms. In one implementation, the configuration variables are provided in a (global) configuration file. In another implementation, the configuration variables may be entered via a user interface screen that may be provided (e.g., for each module) to assist an administrator in the generation of the data models and UI elements.

[0037] These modules and their inputs and outputs are described in further detail below.

[0038] Model builder module 220 may be designed to generate data models of various schemas. The specific schema to be used for the data models is dependent on various factors such as the data architecture employed, the specific design of a runtime engine that will process the data models to generate the required outputs, and so on. Various types of data models may also be generated such as, for example, configuration type and list type. The particular data model type to be generated may be specified in the configuration variables. For clarity, a specific schema for configuration-type data models is described below.

[0039] FIG. 3 shows various tables that may be generated for configuration-type data models, in accordance with an embodiment of the invention. In this embodiment, the configuration-type data models for each pageset include one or more Selectable Attribute feature tables 310, an Items feature table 320, a main configuration table 330, and a configuration sub-table 340. Additional, fewer, and/or different tables may also be provided for the configuration-type data models and are within the scope of the invention. The following description for the tables and sub-table is for a specific pageset (e.g., Woman's Pants).

[0040] For each pageset, a set of Selectable Attribute feature tables 310 is provided for the set of all “selectable” attributes used to uniquely identify the items in the pageset, with one Selectable Attribute feature table being provided for each selectable attribute. In the example shown in FIG. 3, all three candidate attributes are used as selectable attributes for the pageset. In this case, Style feature table 310a, Size feature table 310b, and Color feature table 310c are generated for the Style, Size, and Color selectable attributes, respectively. In a typical implementation, each Selectable Attribute feature table is named after the selectable attribute associated with and represented by the feature table.

[0041] Each Selectable Attribute feature table 310 provides a mapping of codes and their corresponding descriptions. The codes are used to represent the possible values for the associated selectable attribute, and the descriptions are texts that are more intelligible to the end-user. The codes are more efficient internal representations for the attribute values, and the descriptions may be displayed in the UI elements for the end-user. Each Selectable Attribute feature table typically further provides an indication of which specific attribute value should be used as the default value for the associated selectable attribute, if none is specified.

[0042] In the embodiment shown in FIG. 3, each Selectable Attribute feature table 310 includes a Code column, a Description (Desc) column, and a Default column. In an embodiment, the values in the Code column are uniquely distinguishable text strings. In an embodiment, the code values are numeric and sequentially numbered (e.g., starting from 0). These code values correspond to all possible values for the associated selectable attribute for the given pageset (and not for the entire item master). The Description column includes values drawn from the column of the item master corresponding to the selectable attribute being represented by the feature table. As an example, for the Style feature table 310a, code values of 0, 1, 2, and 3 are used to represent the possible styles of Dress, Casual, Twill, and Jean, respectively, for the pageset for Woman's Pants. The Default column includes an indication of which value should be used as the default (e.g., Casual is the default for the Style selectable attribute for this pageset).
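The construction of one Selectable Attribute feature table may be sketched as follows. This is an illustrative sketch only: the function name build_feature_table, the dictionary row layout, and the fallback of using the first value as the default when none is named are assumptions, not details taken from FIG. 3.

```python
def build_feature_table(descriptions, default=None):
    """Return Code/Desc/Default rows for one selectable attribute of a pageset.

    Codes are sequential integers starting from 0, assigned in order of first
    appearance. Assumption: if no default is named, the first value is used.
    """
    unique = list(dict.fromkeys(descriptions))  # de-duplicate, keep order
    if default is None and unique:
        default = unique[0]
    return [
        {"Code": code, "Desc": desc, "Default": desc == default}
        for code, desc in enumerate(unique)
    ]


# Style values for the example pageset, with Casual as the default.
style_table = build_feature_table(
    ["Dress", "Casual", "Twill", "Jean", "Casual"], default="Casual"
)
```

Because the input is restricted to the values seen in one pageset, the resulting codes cover the possible values for that pageset only and not for the entire item master.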

[0043] Items feature table 320 includes item-specific information, and may be used to provide additional information not included in the Selectable Attribute feature tables. In an embodiment, the Items feature table includes a Code column, a Description (Desc) column, an ID column, a Price column, an Item Number column, an Image column, a Default column, and zero or more additional columns. The Code column includes distinguishable values used to represent a key referenced by an Items column of configuration sub-table 340. The Description, ID, Price, Item Number, and Image columns each include the values drawn from the corresponding column in the item master, and these columns may also be specified in the configuration variables. The Default column includes an indication of which row value should be used as the default for the Description column.

[0044] Additional columns may be added to Items feature table 320, e.g., by specifying these columns in the configuration variables. Each additional column (if any) is typically named after the corresponding specified column in the item master, and includes data drawn from that specified column. In the example shown in FIG. 3, the Price, Item Number, and Image columns are added to the Items feature table and these columns include the values drawn from the Price, Item Number, and Image columns in the item master.

[0045] Items feature table 320 includes one row for each item in the pageset. For each item, the values for the columns in the Items feature table are drawn from the corresponding columns in the item master.

[0046] Main configuration table 330 identifies valid and invalid configurations for the selectable attributes in the pageset. Each pageset is associated with a set of selectable attributes, and each selectable attribute is further associated with a set of possible values. The set of all possible combinations of values for these selectable attributes would represent all possible items that may be included in the pageset. However, a given pageset typically includes only a subset of all possible items. Each item actually included in the pageset represents a valid combination (i.e., a valid configuration) in the main configuration table, and items not included in the pageset are invalid combinations that are represented as “exceptions” in the configuration table.
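The distinction between valid and invalid configurations may be illustrated by enumerating every combination of selectable attribute values and separating those that correspond to an actual item. The following is a minimal sketch under assumed names (split_configurations and the sample pageset are hypothetical); it lists each invalid configuration individually and does not attempt any more compact representation of exceptions.

```python
from itertools import product


def split_configurations(items, selectable):
    """Partition all value combinations into valid and invalid configurations."""
    # Possible values per selectable attribute, within this pageset only.
    domains = [sorted({item[a] for item in items}) for a in selectable]
    # A combination is valid if some item in the pageset actually has it.
    valid = {tuple(item[a] for a in selectable) for item in items}
    invalid = [combo for combo in product(*domains) if combo not in valid]
    return sorted(valid), invalid


# Hypothetical pageset with two selectable attributes.
pageset = [
    {"Style": "Dress", "Size": "8"},
    {"Style": "Dress", "Size": "10"},
    {"Style": "Jean", "Size": "10"},
]
valid, invalid = split_configurations(pageset, ["Style", "Size"])
```

In this example, two styles and two sizes give four possible combinations, of which three are items in the pageset and one (Jean in size 8) is an exception.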

[0047] In the embodiment shown in FIG. 3, main configuration table 330 references configuration sub-table 340 for the valid and invalid configurations and includes a Sub-table column and a Rule column. The Sub-table column includes the name of the configuration sub-table that may be referenced to determine valid and invalid configurations for the pageset. The Sub-table column is also referred to as a “type-99” column since it refers to another sub-table. In this example, the name of the configuration sub-table being referenced is “Attribute_Check”. The Rule column may include rules that may be used to cross reference some other information, e.g., exception messages.

[0048] Configuration sub-table 340 identifies the valid and invalid configurations for the pageset. These configurations may be represented in numerous ways, with the more efficient representation being dependent on the specific data in the pageset. In one simple implementation, the configuration sub-table may include one entry (i.e., one row) for each possible configuration, with the valid configurations being grouped into one row set and the invalid configurations being grouped into another row set. For many pagesets, the number of valid configurations may represent only a small subset of all possible configurations, the number of invalid configurations may be large, and it may not be efficient to list each invalid configuration with its own row in the configuration sub-table. Techniques to more efficiently represent invalid configurations are described below.

[0049] In the embodiment shown in FIG. 3, configuration sub-table 340 includes one column for each selectable attribute for the pageset (e.g., Style, Size, and Color columns), the Items column for Items feature table 320, and zero or more additional columns. For efficiency, the configuration sub-table typically uses code values to represent the configurations. Thus, each Selectable Attribute column (which is also referred to as a “type-1” column) refers to a corresponding Selectable Attribute feature table. The Items column (which is also referred to as a “type-0” column since it does not refer to another table) includes data drawn from a specified column in the Items feature table. Each additional column (if any) includes either fixed text or data drawn from a specified column in the item master.

[0050] As shown in FIG. 3, configuration sub-table 340 includes a Data row set that lists valid configurations for the pageset and an Exception row set that lists invalid configurations. The valid and invalid configurations may be determined as described below.

[0051] The list-type data models may include Items feature table 320 and main configuration table 330. Additional, fewer, and/or different tables may also be provided for the list-type data models and are within the scope of the invention. The main configuration table uses the Items feature table as a single type-1 column with a value of “*” in the data cell (i.e., match all the rows from the Items feature table).

[0052] FIG. 3 shows one specific design for the data models, which may be used to generate catalog applications. Various other designs for the data models may also be implemented and are within the scope of the invention. For example, a data model design that may also be used is described in European Patent Application Serial No. 99309178.4, entitled “Method and Apparatus for Interpreting User Selections in the Context of a Relation Distributed as a Set of Orthogonalized Sub-Relations,” filed Nov. 18, 1999, assigned to the assignee of the present application and incorporated herein by reference. In general, any type of data models having attribute-to-UI relationships may be used in conjunction with the techniques described herein. Moreover, these data models need not be implemented with tables.

[0053] Referring back to FIG. 2, data builder module 210 receives the item master and the first set of configuration variables and provides a set of intermediate data files for model builder module 220. The item master may be in the form shown in FIG. 1 and is typically provided in a single data file. The configuration variables for data builder module 210 may include the following information:

[0054] Item master file name: identifies the particular file that includes the item master to be operated on by data builder module 210.

[0055] ID column name: identifies the ID column in the item master.

[0056] List of classification columns: the attributes corresponding to these columns are used to group the items in the item master into pagesets.

[0057] List of candidate columns: the attributes corresponding to these columns may be selected and used to uniquely identify the items in each pageset.

[0058] List of data attribute columns: the attributes corresponding to these columns may be used to further describe the items in the item master.

[0059] Columns that are trigger-target pairs: an attribute may be used as a trigger for another attribute. For example, different sets of sizes for pants may be applicable for different styles of pants. In this case, the style attribute is used as a trigger to determine the proper set of sizes for that style of pant.

[0060] Columns that will be added to the Items feature table: identifies the columns in the item master that will be added to the Items feature table.

[0061] Directory path for the log and intermediate data files: identifies the location where the log and intermediate data files are to be saved.

[0062] Toggle for auto-conversion of single-widget data models to list-type.

[0063] The configuration variables for data builder module 210 listed above are for a specific implementation. For other implementations, the configuration variables may include additional, fewer, and/or different information than that listed above, and this is within the scope of the invention.
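The configuration variables listed above may be gathered into a single structure, as in the following sketch. The class name DataBuilderConfig, every field name, the default values, and the sample file name are hypothetical; the description does not specify a file format or variable names for the configuration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DataBuilderConfig:
    """Hypothetical container for the data builder's configuration variables."""
    item_master_file: str                   # file that holds the item master
    id_column: str                          # name of the ID column
    classification_columns: List[str]       # used to group items into pagesets
    candidate_columns: List[str]            # may be chosen as selectable attributes
    data_attribute_columns: List[str] = field(default_factory=list)
    # (trigger, target) pairs, e.g. Style determines the applicable Size set.
    trigger_target_pairs: List[Tuple[str, str]] = field(default_factory=list)
    items_table_columns: List[str] = field(default_factory=list)
    output_dir: str = "."                   # log and intermediate data files
    auto_convert_single_widget: bool = False


config = DataBuilderConfig(
    item_master_file="item_master.csv",     # assumed file name
    id_column="ID",
    classification_columns=["Gender", "Type"],
    candidate_columns=["Style", "Size", "Color"],
    data_attribute_columns=["Desc", "Price", "Item Number", "Image"],
    trigger_target_pairs=[("Style", "Size")],
    items_table_columns=["Price", "Item Number", "Image"],
)
```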

[0064] FIG. 4A is a flow diagram of an embodiment of a process 400 performed by data builder module 210 to generate the intermediate data files from the item master. Initially, the item master is validated to identify any “uncleanliness” in the data that would prevent the generation of complete data models for the item master, at step 410. This validation may entail checking the item master to ensure that (1) no two rows have duplicate data, (2) the attributes for the Classification and Candidate Attribute columns are not blank (i.e., no empty strings), and so on. The configuration data may also be validated, at step 412. If any errors in the item master and/or configuration data are encountered, as determined in step 414, then error messages are generated and included in a log file that is made available to the administrator, at step 416. Via the log file, the administrator is informed of the errors and can clean up the input data. Steps 410 through 416 may be iteratively performed until the data in the item master and the configuration data are validated.
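The two item master checks named above (duplicate rows and blank Classification or Candidate Attribute values) may be sketched as follows. The function name validate_item_master, the message wording, and the sample rows are assumptions for illustration; an actual implementation may perform additional checks and would write the messages to a log file.

```python
def validate_item_master(rows, required_columns):
    """Return a list of error messages; an empty list means the data is clean."""
    errors = []
    seen = {}  # full row contents -> row number of first occurrence
    for number, row in enumerate(rows, start=1):
        key = tuple(sorted(row.items()))
        if key in seen:
            errors.append(f"row {number}: duplicate of row {seen[key]}")
        else:
            seen[key] = number
        for column in required_columns:
            # Missing, None, and whitespace-only values all count as blank.
            if not (row.get(column) or "").strip():
                errors.append(f"row {number}: blank value in column '{column}'")
    return errors


# Hypothetical rows: row 2 has a blank Size, row 3 repeats row 1.
sample_rows = [
    {"Gender": "Woman", "Type": "Pants", "Style": "Dress", "Size": "8"},
    {"Gender": "Woman", "Type": "Pants", "Style": "Jean", "Size": ""},
    {"Gender": "Woman", "Type": "Pants", "Style": "Dress", "Size": "8"},
]
errors = validate_item_master(sample_rows, ["Gender", "Type", "Style", "Size"])
```

An administrator would correct the reported rows and rerun the check until it returns no errors, which corresponds to the iteration over steps 410 through 416.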

[0065] Once the data is validated, data builder module 210 groups the items in the item master into pagesets, at step 418. This is achieved based on the attribute values in the Classification columns identified by the configuration variables. In particular, each unique set of attribute values for the Classification columns is associated with a separate pageset. All items in the item master having the same set of attribute values for the Classification columns are grouped into the same pageset.

[0066] The grouping of the items in the item master into pagesets may be performed by traversing the item master, one record at a time. For each record, the Classification column values are determined. If this set of values has not been encountered in a previous record, then a new pageset is defined and the record is grouped into that pageset. Otherwise, the record is grouped into a pageset previously defined for another item in the item master. A column may be provided in the item master to mark the particular pageset to which each item belongs.
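The record-by-record grouping described above may be sketched as a single pass over the item master. The function name group_into_pagesets and the sample records are hypothetical; the sketch keys each pageset by its tuple of Classification column values rather than adding a marker column to the item master.

```python
def group_into_pagesets(rows, classification_columns):
    """Traverse records once; each unique classification tuple defines a pageset."""
    pagesets = {}
    for row in rows:
        key = tuple(row[c] for c in classification_columns)
        if key not in pagesets:
            pagesets[key] = []  # first record with this combination: new pageset
        pagesets[key].append(row)
    return pagesets


# Hypothetical records classified by Gender and Type.
records = [
    {"ID": "1", "Gender": "Woman", "Type": "Pants"},
    {"ID": "2", "Gender": "Man", "Type": "Shirt"},
    {"ID": "3", "Gender": "Woman", "Type": "Pants"},
]
pagesets = group_into_pagesets(records, ["Gender", "Type"])
```

The number of keys in the result equals the number of unique sets of Classification column values, here two.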

[0067] Table 100 in FIG. 1 shows an example of the grouping of the items into pagesets. The number of pagesets is equal to the number of unique sets of classification column values.
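The record-by-record grouping described above may be sketched as follows. This is a minimal illustration only; the function name group_into_pagesets and the dictionary-based row layout are assumptions, and the sample column names merely echo the Gender and Type example of FIG. 1.

```python
def group_into_pagesets(rows, classification_columns):
    """Group item master rows by their Classification column values.

    Returns a dict keyed by the tuple of classification values; each
    distinct tuple corresponds to one pageset.
    """
    pagesets = {}
    for row in rows:
        key = tuple(row[column] for column in classification_columns)
        # A key not seen before defines a new pageset; otherwise the row
        # joins the pageset defined by an earlier row.
        pagesets.setdefault(key, []).append(row)
    return pagesets


items = [
    {"Gender": "Woman", "Type": "Dress", "Color": "Red"},
    {"Gender": "Woman", "Type": "Dress", "Color": "Blue"},
    {"Gender": "Man", "Type": "Pant", "Color": "Blue"},
]
pagesets = group_into_pagesets(items, ["Gender", "Type"])
```

The number of keys in the returned dictionary equals the number of unique sets of classification column values, and hence the number of pagesets.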

[0068] Once the items in the item master are grouped into pagesets, data builder module 210 identifies a list of attributes that may be used to uniquely identify the items in each pageset, at step 420. These attributes are referred to as selectable attributes. In an embodiment, one set of selectable attributes is provided for each pageset, and different pagesets may be associated with different sets of selectable attributes. The selectable attributes are chosen from those associated with the Candidate Attribute columns identified in the configuration variables. The selectable attributes for each pageset may be determined as described below in FIG. 4B.

[0069] Data builder module 210 then generates output files based on the pageset data, at step 422. In an embodiment, these files include (1) a verbose log file that may be used to provide information, warnings, errors, and so on as feedback to an administrator regarding the quality of the data, (2) a status file that lists all pagesets to be generated, the pageset name, items, selectable attributes, and data model type, and (3) a set of intermediate data files to be used by model builder module 220 to generate data models. Additional, fewer, and/or different output files may also be generated and are within the scope of the invention. The processing by data builder module 210 then terminates.

[0070] In an embodiment, the configuration variables identify a list of Candidate Attribute columns, and the attributes corresponding to these columns (which are also referred to as candidate attributes) may be selected and used to uniquely identify the items in each pageset. For each pageset, a (minimum) number of candidate attributes may be selected such that each item in the pageset may be uniquely identified based on these selected attributes. Since each of these selected attributes may also be selected (i.e., configured with a value) by the end-user via the catalog application (e.g., a UI screen), they are referred to as selectable attributes. The designation of the attributes in the item master as candidate attributes and/or the selection of the candidate attributes as selectable attributes may be made by the administrator (e.g., specified via the configuration variables), automatically by data builder module 210, or a combination of both.

[0071] In an embodiment, the candidate attributes are grouped into two categories labeled as “mandatory” and “optional”. Mandatory attributes are those attributes designated by the administrator to be used as selectable attributes (and may or may not be helpful in determining uniqueness among items in a pageset). Optional attributes are those that may be selected for use to uniquely identify items if the mandatory attributes are not sufficient to determine uniqueness. The designation of each candidate attribute as either mandatory or optional may be made by the administrator or via another means.

[0072] In an embodiment, the optional attributes are provided in an ordered list, and these attributes are thereafter selected for use to determine uniqueness, one at a time and as needed, based on their order in the list. Thus, the first optional attribute in the list is considered first to determine whether or not it is useful for item identification, the second optional attribute in the list is considered next, and so on, and the last optional attribute in the list is considered last.

[0073] FIG. 4B is a flow diagram of an embodiment of a process 420 to determine a list of selectable attributes that may be used to uniquely identify the items in each pageset. Initially, the number of unique items in the pageset is determined, at step 442. This may be achieved by simply counting the number of items in the item master belonging to the pageset being processed. This number is denoted as “A”.

[0074] The optional attributes are then placed in a first list in the order specified in the configuration variables, at step 444. These optional attributes may be considered, one at a time if necessary and in the order in which they are placed on the first list, to determine uniqueness. The mandatory attributes (if any) are placed in a second list, at step 446. The number of sets of unique values for the attributes in the second list is then determined, at step 448. This number is denoted as “B”.

[0075] A determination is then made whether the number of unique attribute value sets is equal to the number of unique items in the pageset (i.e., whether A=B), at step 450. If these numbers are equal, indicating that the mandatory attributes in the second list are sufficient to uniquely identify the items in the pageset, then the process proceeds to step 468. Otherwise, if the mandatory attributes are not sufficient to determine uniqueness, the optional attributes are considered, one by one, until a sufficient number of optional attributes is included to specify item uniqueness.

[0076] The consideration of the optional attributes begins in step 452, where a determination is made whether the first list of optional attributes is empty. If the first list is empty, then an error message may be generated in the log file, at step 454, and the process terminates. Otherwise, if the first list is not empty, then the highest-order optional attribute in the first list is selected for consideration and placed in the second list, at step 456. The current value of B is then saved as C, in step 458, and the number of sets of unique values for the attributes in the second list is determined and saved as the new value of B, at step 460.

[0077] A determination is then made whether the number of unique attribute value sets is equal to the number of unique items in the pageset (i.e., whether A=B), at step 462. If these numbers are equal, indicating that the mandatory and optional attributes in the second list are sufficient to uniquely identify the items in the pageset, then the process proceeds to step 468.

[0078] If B is not equal to A at step 462, then a determination is made whether the number of unique attribute value sets with the latest optional attribute is greater than the number of unique attribute value sets without the latest optional attribute (i.e., whether B>C), at step 464. If B is not greater than C, indicating that the latest optional attribute was not useful in determining uniqueness, then this attribute is removed from the second list, at step 466. Otherwise, the optional attribute is retained in the second list. In either case, the process then returns to step 452 to consider the next optional attribute.

[0079] At step 468, since the number of unique attribute value sets is equal to the number of unique items in the pageset, the second list is provided as the list of selectable attributes that may be used to specify item uniqueness for the pageset. The process then terminates.
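The steps of FIG. 4B described above may be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the dictionary-based item layout, and the use of an exception to stand in for the error message of step 454 are all hypothetical. The variables a, b, and c correspond to the quantities “A”, “B”, and “C” in the text.

```python
def count_unique_value_sets(items, attributes):
    """Number of distinct value tuples the items take on the given attributes."""
    return len({tuple(item[a] for a in attributes) for item in items})


def select_attributes_4b(items, mandatory, optional):
    a = len(items)                    # step 442: number of items, "A"
    first_list = list(optional)       # step 444: optional attributes, in order
    second_list = list(mandatory)     # step 446: mandatory attributes
    b = count_unique_value_sets(items, second_list)  # step 448: "B"
    while b != a:                     # steps 450 and 462: is A equal to B?
        if not first_list:            # step 452: no optional attributes left
            # Step 454 (an exception stands in for the log file entry).
            raise ValueError("attributes cannot uniquely identify all items")
        second_list.append(first_list.pop(0))            # step 456
        c = b                                            # step 458: save B as "C"
        b = count_unique_value_sets(items, second_list)  # step 460: new "B"
        if b != a and b <= c:         # step 464: attribute did not help
            second_list.pop()         # step 466: remove it again
    return second_list                # step 468: the selectable attributes


pageset_items = [
    {"Color": "Red", "Size": "2", "Fabric": "Cotton"},
    {"Color": "Red", "Size": "4", "Fabric": "Cotton"},
    {"Color": "Blue", "Size": "2", "Fabric": "Cotton"},
]
selected = select_attributes_4b(pageset_items, ["Color"], ["Fabric", "Size"])
```

In this example, Fabric has the same value for every item and is therefore removed at step 466, whereas Size raises the count of unique value sets to the number of items and is retained.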

[0080] FIG. 4C is a flow diagram of another embodiment of a process 470 to determine a list of selectable attributes. Initially, the items for a particular pageset to be processed are identified, at step 472. A determination is then made whether or not a list of mandatory attributes is empty, at step 474. If this list is not empty, then all mandatory attributes are moved to the selectable attribute list, at step 476. Otherwise, the first element of a list of optional attributes is moved to the selectable attribute list, at step 478.

[0081] The items in the pageset that can be uniquely identified by the sets of values for the attributes in the selectable attribute list are then marked, at step 480. A determination is then made whether there are any unmarked items in the pageset, at step 482. If all items are marked, then the selectable attribute list is returned as the list of selectable attributes that may be used to specify item uniqueness for the pageset, at step 484. The process then terminates.

[0082] Otherwise, if there is any unmarked item in the pageset, a determination is made whether or not the optional attribute list is empty, at step 486. If the optional attribute list is empty, then an error message may be generated in the log file, at step 488, and the process terminates. If the optional attribute list is not empty, then a determination is made whether adding the first element of the optional attribute list to the selectable attribute list would help to uniquely identify the unmarked items in the pageset, at step 490. If the answer is no, then the first element of the optional attribute list is discarded, at step 492. Otherwise, the first element of the optional attribute list is moved to the selectable attribute list, at step 494, and the process returns to step 480.
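The marking-based process of FIG. 4C may be sketched in Python as follows. The text does not specify how step 490 decides whether an optional attribute “would help”; this sketch assumes, purely for illustration, that an attribute helps if it increases the number of distinct value sets among the unmarked items. The function names and the use of an exception in place of the log file entry of step 488 are likewise assumptions.

```python
from collections import Counter


def mark_unique(items, attributes):
    """Step 480: flag each item whose value set on the attributes is unique."""
    keys = [tuple(item[a] for a in attributes) for item in items]
    counts = Counter(keys)
    return [counts[key] == 1 for key in keys]


def distinct(items, attributes):
    return len({tuple(item[a] for a in attributes) for item in items})


def select_attributes_4c(items, mandatory, optional):
    remaining = list(optional)
    if mandatory:                        # steps 474 and 476
        selectable = list(mandatory)
    elif remaining:                      # step 478
        selectable = [remaining.pop(0)]
    else:
        selectable = []
    while True:
        marked = mark_unique(items, selectable)               # step 480
        unmarked = [i for i, ok in zip(items, marked) if not ok]
        if not unmarked:                 # step 482: every item is marked
            return selectable            # step 484
        if not remaining:                # step 486
            # Step 488 (an exception stands in for the log file entry).
            raise ValueError("attributes cannot uniquely identify all items")
        candidate = remaining.pop(0)
        # Step 490 (assumed criterion): does the candidate split the
        # unmarked items into more distinct value sets than before?
        if distinct(unmarked, selectable + [candidate]) > distinct(unmarked, selectable):
            selectable.append(candidate)  # step 494
        # Otherwise the candidate is simply discarded (step 492).


garments = [
    {"Color": "Red", "Size": "2", "Fabric": "Cotton"},
    {"Color": "Red", "Size": "4", "Fabric": "Cotton"},
    {"Color": "Blue", "Size": "2", "Fabric": "Cotton"},
]
chosen = select_attributes_4c(garments, ["Color"], ["Fabric", "Size"])
```

Other criteria for step 490 are possible (e.g., requiring that at least one additional item become marked), and the choice affects which optional attributes are retained.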

[0083] The process shown in FIG. 4B or 4C may be executed for each pageset in the item master and provides a list of selectable attributes that may be used to determine uniqueness for each pageset. In the embodiment shown, the minimum number of selectable attributes is provided for each pageset, since optional attributes that do not contribute to item identification are removed. Moreover, the attributes to be considered and their order for consideration may be specified (e.g., by the administrator via the configuration variables) or may be automatically determined (e.g., by data builder module 210).

[0084] The process to select attributes to specify item uniqueness results in the creation of configuration-type data models. This process may be overridden in the configuration file by specifying that a particular combination of Classification column values should generate list-type data models instead. Configuration-type data models may also be automatically converted into list-type data models (list-type data models may be generated from the item master) via a parameter value in the first set of configuration variables provided to data builder module 210. The administrator may also elect to automatically convert configuration-type data models that contain only one selectable attribute into list-type data models.

[0085] Model builder module 220 receives the set of intermediate data files from data builder module 210 and the second set of configuration variables and generates data models that may be provided to file builder module 230 and/or stored to repository 250. The data models are provided in one or more formats which may be specified (e.g., via the configuration variables). The configuration variables for model builder module 220 may include the following information:

[0086] Output format—XML, HTML, repository, or a combination thereof

[0087] Overwrite existing data models—true or false

[0088] Directory path for the log file

[0089] Directory containing the intermediate data files

[0090] Gateway repository project name (if saving to the repository)

[0091] Gateway database connect string (if saving to the repository)

[0092] HTML file destination directory (if using HTML)

[0093] Backup HTML files (if using HTML)—true or false

[0094] XML output directory

[0095] Backup XML files—true or false

[0096] Callout process (optional)

[0097] The configuration variables for model builder module 220 listed above are for a specific implementation. For other implementations, the configuration variables may include additional, fewer, and/or different information than those listed above, and this is within the scope of the invention.

[0098] FIG. 5A is a flow diagram of an embodiment of a process 500 performed by model builder module 220. Initially, model builder module 220 processes the data for each pageset to generate a set of tables and a sub-table for the data models, as shown in FIG. 3, at step 510. Model builder module 220 then identifies valid configurations and generates exceptions for each pageset, at step 512. Exception messages may be generated to identify invalid configurations, and these messages may be generated by passing the pageset data through several “methods” in sequential order, as described below in FIG. 5B.

[0099] Model builder module 220 then represents the data models (e.g., in XML) and adds to this XML other pageset-specific data, at step 514. The pageset-specific data includes raw data from the item master and information generated by data builder module 210 such as which attributes are mandatory and which are optional.

[0100] In an embodiment, model builder module 220 supports the inclusion of an optional administrator-specified callout process to be applied to the data models. Via the callout process, the administrator is able to examine and modify the data models before they are provided in final output form. If the data models are provided as XML pagesets, as for the embodiment described herein, the callout processes are designed with the capability to operate on streamed XML as both input and output. Any number of processes may be used as the in-line callout process.

[0101] Thus, the XML may be passed streaming through an optional callout process, at step 516. After streaming through the callout process, the XML may be validated before the final data models are generated, at step 518. Model builder module 220 then provides the optionally modified and validated XML representation for each pageset in one or more output forms, which may be specified by the administrator via the configuration variables.

[0102] First, XML documents may be created (e.g., as specified by the configuration variables and/or as the default form), at step 520. In this case, one XML document is provided for each pageset, with the XML document containing the data models and other pageset-specific information such as data for the pageset items obtained from the item master. A master XML document containing application-specific data (as opposed to pageset-specific data) and references to all of the pageset documents is also created and provided. This master XML document may be used to generate the contents list for the item master, as described below. Second, the data models may optionally be provided to repository 250, at step 522. Third, the data models may optionally be used directly to generate HTML files (*_00.htm and *_m.htm files), at step 524. The HTML files may be read directly by a runtime engine, which can allow for rapid generation of pageset screens. However, the data models in HTML form may not be easily modified by a subsequent process.

[0103] FIG. 5B is a flow diagram of an embodiment of a process to examine the data for each pageset to generate exception messages for invalid configurations. In an embodiment, logical guidance for generating the exception messages is derived from the pageset data itself. Various types of exception messages may be generated for invalid configurations. These data-dependent exception messages can greatly reduce the number of messages that need to be generated and further reduce the number of entries needed to represent invalid configurations in the configuration sub-table.

[0104] Initially, the pageset data is examined to identify any selectable attribute that has only one value (i.e., the attribute value occurs only once in the valid configurations for the pageset), at step 542. For each such attribute value, a first type of exception message may be generated such as, e.g., “[selectable attribute value] is only available with [list of other selectable attribute values an item is available with].” Next, the pageset data is examined to identify any pair of selectable attribute values that do not occur together in a valid configuration (e.g., red dress), at step 544. For each such attribute pair, a second type of exception message may be generated such as, e.g., “[attribute 1 value] is not available with [attribute 2 value]”. The first two types of exception messages are thus effectively generated from valid configurations. Typically, the first two types of exception messages cover a large percentage of all invalid configurations. Finally, the pageset data is examined to identify all remaining invalid configurations, at step 546. For each such invalid configuration, a third type of exception message may be generated such as, e.g., “[the combination of selected attribute values is an invalid configuration].”

[0105] Each of the steps described above may be performed via a respective method. The exception messages may also be provided in a log file that is provided by model builder module 220. The log file provides the result of the data modeling so that the administrator can review the result.
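The three passes of FIG. 5B may be sketched in Python as follows. This is a minimal illustration only: the function name, the representation of each valid configuration as a dictionary of string values, and the treatment of a remaining invalid configuration as one that is not already covered by a pair-type message are assumptions. The sketch also does not suppress overlap between the first and second message types.

```python
from itertools import combinations, product


def exception_messages(valid_configs, attributes):
    values = {a: sorted({cfg[a] for cfg in valid_configs}) for a in attributes}
    pairs = list(combinations(attributes, 2))
    # Every (attribute, value) pair combination seen in a valid configuration.
    seen_pairs = {((a, cfg[a]), (b, cfg[b]))
                  for cfg in valid_configs for a, b in pairs}

    # Step 542: values that occur in exactly one valid configuration.
    only_with = []
    for a in attributes:
        for v in values[a]:
            matches = [cfg for cfg in valid_configs if cfg[a] == v]
            if len(matches) == 1:
                others = ", ".join(matches[0][o] for o in attributes if o != a)
                only_with.append(f"{v} is only available with {others}")

    # Step 544: value pairs that never occur together in a valid configuration.
    not_with = []
    for a, b in pairs:
        for va, vb in product(values[a], values[b]):
            if ((a, va), (b, vb)) not in seen_pairs:
                not_with.append(f"{va} is not available with {vb}")

    # Step 546: invalid configurations not covered by a pair-type message.
    valid_set = {tuple(cfg[a] for a in attributes) for cfg in valid_configs}
    remaining = []
    for combo in product(*(values[a] for a in attributes)):
        if combo in valid_set:
            continue
        cfg = dict(zip(attributes, combo))
        if all(((a, cfg[a]), (b, cfg[b])) in seen_pairs for a, b in pairs):
            remaining.append(cfg)
    return only_with, not_with, remaining


valid = [
    {"Color": "Red", "Size": "2"},
    {"Color": "Red", "Size": "4"},
    {"Color": "Blue", "Size": "2"},
]
only_with, not_with, remaining = exception_messages(valid, ["Color", "Size"])
```

For this two-attribute example, every invalid configuration is already explained by a pair-type message, so nothing is left for the third message type.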

[0106] File builder module 230 receives the data models (e.g., the XML documents) from model builder module 220 and the third set of configuration variables and generates data-dependent application files. The configuration variables for file builder module 230 may include the following information:

[0107] Path of the executable for a Xalan XSL processor

[0108] Directory for XML documents

[0109] Directory for the log file output

[0110] For each XSLT stylesheet to be used:

[0111] Stylesheet name

[0112] Directory for stylesheet output

[0113] Backup stylesheet output—true or false

[0114] The configuration variables for file builder module 230 listed above are for a specific implementation. For other implementations, the configuration variables may include additional, fewer, and/or different information than those listed above, and this is within the scope of the invention.

[0115] File builder module 230 generates application files from the received XML documents. In an embodiment, these application files include a Contents List page, one or more Input pages, and one or more Output pages. The application files may either include or be used to generate UI elements suitable for representing the item master.

[0116] The Contents List page is generated from the attributes corresponding to the Classification columns in the item master and is used to provide a hierarchical tree of the pagesets for the item master. The hierarchical tree may include any number of levels, with one level being provided for each classification column. For the example shown in FIGS. 1 and 3, the first level may be Gender and the second level may be Type. The Contents List page provides a means for an end-user to navigate through the item master to arrive at the desired pageset.
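The hierarchical tree underlying the Contents List page may be sketched as a nested dictionary with one level per classification column. The function name build_contents_tree and the nested-dictionary representation are assumptions for illustration; the Gender and Type levels follow the example given above.

```python
def build_contents_tree(rows, classification_columns):
    """Build a nested dict: one level per classification column.

    The innermost dicts are empty and stand for individual pagesets.
    """
    tree = {}
    for row in rows:
        node = tree
        for column in classification_columns:
            # Descend one level, creating the branch if it does not exist.
            node = node.setdefault(row[column], {})
    return tree


catalog = [
    {"Gender": "Woman", "Type": "Dress"},
    {"Gender": "Woman", "Type": "Pant"},
    {"Gender": "Man", "Type": "Pant"},
    {"Gender": "Woman", "Type": "Dress"},
]
contents = build_contents_tree(catalog, ["Gender", "Type"])
```

Each path from the root to an innermost node corresponds to one unique combination of classification values, i.e., to one pageset.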

[0117] The Input pages represent the selectable attributes and are generated based on the Selectable Attribute feature tables, the main configuration table, and the configuration sub-table. Typically, one Input page is generated per pageset, and each Input page includes all selectable attributes for the pageset. Upon selection of a particular pageset by the end-user, the Input page for the selected pageset may be displayed. Depending on the specific implementation, the Input page may allow the end-user to view all valid configurations for the pageset, or may allow the end-user to select a particular configuration and respond whether the selected configuration is valid or invalid. The Input pages reference the Selectable Attribute feature tables.

[0118] The Output pages represent the additional data for the items in each pageset. This data may include the data in the Data Attribute columns in the item master (e.g., the Price, Item Number, and Image columns in the item master shown in FIG. 1). The data in the Output pages may be presented in various manners. In one implementation, upon selection of a particular valid configuration via the Input page, the additional data corresponding to the selected configuration is retrieved from the Output page and presented to the end-user.

[0119] FIG. 6 shows an embodiment of a screen 600 capable of presenting items in the item master. In this embodiment, the screen includes three frames 610, 620, and 630 generated from the application files and used to display the Contents List page, the Input page, and Output page, respectively.

[0120] The Contents List page is rendered in frame 610, via which the end-user is able to navigate through various classification attributes to arrive at the desired pageset. In the example shown, the classification attributes are presented via a hierarchical tree structure, with the Gender level including two choices (Woman and Man) and each Gender choice further including a number of choices. In another implementation, the classification attributes may also be represented with a set of list boxes, one list box for each classification attribute, with each list box including the possible choices for the classification attribute. The values in each classification attribute list box may be dependent on the values selected for other classification attributes. The specific set of values selected for all classification attributes directs the end-user to the associated pageset.

[0121] The Input page for the selected pageset is rendered in frame 620, which provides the list of selectable attributes. In an embodiment, the default values are populated in the list boxes for the selectable attributes. In an embodiment, a specific value may be selected for each selectable attribute (e.g., to override the default value). Upon selection of a specific set of values for all selectable attributes, the configuration corresponding to this specific set of values may be checked to determine whether it is valid or invalid. For example, the end-user may select the configuration of a blue, size 2, dress pant. If the configuration is valid, the Output page for the selected configuration is displayed in frame 630. For the above example, the additional data for the selected configuration may include the price of $59.95 and the item number of 128. Otherwise, if the configuration is not valid, the appropriate exception message may be displayed, e.g., in frame 630.
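The validity check and retrieval of additional data described above may be sketched as a simple table lookup. The function name, the table layout keyed by tuples of selectable attribute values, and the fallback message text are assumptions for illustration; the price and item number are those of the example above.

```python
def lookup_configuration(config_table, selection, attributes):
    """Return the additional data for a valid selection, or None if invalid."""
    key = tuple(selection[a] for a in attributes)
    return config_table.get(key)


# Hypothetical configuration table for a dress pant pageset.
dress_pant_table = {
    ("Blue", "2"): {"Price": "$59.95", "Item Number": "128"},
}

selection = {"Color": "Blue", "Size": "2"}
output = lookup_configuration(dress_pant_table, selection, ["Color", "Size"])
if output is None:
    # An invalid selection would show an exception message instead.
    message = "The combination of selected attribute values is an invalid configuration."
else:
    message = f"Price {output['Price']}, item number {output['Item Number']}"
```

In this sketch, a result of None corresponds to an invalid configuration, for which an exception message rather than the Output page would be displayed.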

[0122] The application files may be generated based on a particular template. The use of the template allows for flexibility in creating both the contents and the data-based logic of the application files.

[0123] The template may be provided via a file, specified via a user interface screen, or provided via some other means. Default templates may be provided and used for creating the application files. The templates may be modified (or customized) to suit the specific application design.

[0124] In one specific implementation, the templates comprise XSLT (Extensible Style Language Transformations) stylesheets. In this implementation, file builder module 230 includes runtime Java and a Java-based Xalan XSL (Extensible Style Language) processor, which is publicly available. The Xalan XSL processor is a particular implementation of XSL transformation, and is used to convert XML documents to other types of documents such as XML, HTML, PDF, and possibly others. Other types of templates may also be defined and used and are within the scope of the invention. For example,

[0125] File builder module 230 further provides a log file that provides information for the administrator.

[0126] System 200 may be launched to execute the data, model, and file builder modules in the proper sequential order. The execution may be initiated via a single batch file, which is a “wrapper” script. The wrapper script may receive a configuration variable indicating a starting point module and start execution from this starting point module. The starting point may be data builder module 210, model builder module 220, or file builder module 230. In this way, the administrator is able to run only the desired module(s) of system 200, e.g., those that have not yet been run.
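The starting-point behavior of the wrapper script may be sketched as follows. The short module names ("data", "model", "file"), the function names, and the use of callables to stand in for the three builder modules are assumptions for illustration only and do not reflect an actual batch file.

```python
# Fixed execution order of the three builder modules (hypothetical names).
MODULE_ORDER = ["data", "model", "file"]


def modules_to_run(starting_point):
    """Return the modules to execute, beginning at the starting point."""
    if starting_point not in MODULE_ORDER:
        raise ValueError(f"unknown starting point: {starting_point}")
    return MODULE_ORDER[MODULE_ORDER.index(starting_point):]


def run_pipeline(starting_point, runners):
    """Run each selected module in order; runners maps name -> callable."""
    executed = []
    for name in modules_to_run(starting_point):
        runners[name]()
        executed.append(name)
    return executed


# Example: resume from the model builder, skipping the data builder.
executed = run_pipeline("model", {name: (lambda: None) for name in MODULE_ORDER})
```

Starting from "model" in this sketch skips the data builder stage, which is useful when the intermediate data files already exist.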

[0127] FIG. 7 is a diagram of a system 700 capable of automatically generating data models using items stored in a repository, in accordance with another embodiment of the invention. System 700 is an example of a repository-based design, and also takes an item master as input and can generate data-dependent components of a catalog-type application. In this embodiment, system 700 includes a catalog admin module 710, a catalog builder module 720, a designer module 730, and a publisher module 740. Catalog builder module 720 further includes a data/model builder module 722 and a contents list builder module 724. A database 750 provides the data for system 700 and further stores the data models generated by system 700.

[0128] The item master from the database may be comprised of a number of smaller tables that may be represented using any number of (relational) database schemas. In that case, the collection of all attributes for the items in the item master may not apply to each item (i.e., not all items in the item master may be associated with all of the attributes). In an embodiment, only attributes that are common to all items in the item master are considered for use as classification attributes, and only attributes that are common to all items in each pageset are considered for use as candidate attributes for that pageset. Other attributes that apply to only some of the items in the item master would then be designated as data attributes.

[0129] Catalog admin module 710 receives the item master and possibly other data from database 750 and provides classification and configuration data. In an embodiment, all or a subset of the common attributes for the item master may be presented to an administrator (e.g., via a screen), who may then select a set of these common attributes to classify the item master. This allows the administrator to categorize the item master and define pagesets in any desired manner. This first set of attributes comprises the classification data for the item master. The classification data is then provided to data/model builder module 722 and used to classify the items in the item master into pagesets, as described above. The classification data is also provided to contents list builder module 724 and used to generate a Contents List table.

[0130] In an embodiment, all or a subset of the common attributes (which are not classification attributes) for each pageset may also be presented to the administrator, who may then select a set of these attributes as candidate attributes for the pageset. The administrator may also define configuration variables (e.g., such as those described above) via catalog admin module 710.

[0131] Within catalog builder module 720, data/model builder module 722 receives (1) the item master and possibly extended attributes for the items in the item master from database 750, (2) the classification data from catalog admin module 710, and (3) configuration data, which may be provided via a file and/or by catalog admin module 710. Data/model builder module 722 performs many of the functions described above for data builder module 210 and model builder module 220. Data/model builder module 722 may first generate an item master (on the fly) similar to that shown in FIG. 1 if the data/model builder module is provided with a collection of smaller “normalized” tables that collectively defines the Products table. A normalized table is typically provided for each attribute (e.g., color, size). The use of normalized tables may reduce the amount of redundancy (i.e., without normalization, many combinations of redundant information may be present).

[0132] Data/model builder module 722 then classifies the items in the item master into pagesets using the classification data, and further generates data models for each pageset using the candidate attributes and configuration data. The candidate attributes may be provided by catalog admin module 710, via the configuration data, and/or derived from the extended attributes by data/model builder module 722. In an embodiment and as described above, the data models for each pageset may include a set of feature tables, a main configuration table, and a configuration sub-table. These tables and sub-table may be generated as described above in FIG. 3. The data models may be stored back to database 750, or may be provided directly to publisher module 740.

[0133] Contents list builder module 724 receives the classification data and generates a Contents List table, which may be used for navigation through the item master as described above. The Contents List table may also be stored back to database 750, or may be provided directly to publisher module 740.

[0134] Catalog builder module 720 may also provide a log file, which may include error messages indicating any uncleanliness in the received item master and/or the configuration data. The information provided in the log file may be used to clean the item master (e.g., modify item values and/or attributes), the configuration data, and/or the classification data, such that valid data models may be generated.

[0135] Designer module 730 provides graphical user interface (GUI) tools to assist the administrator in manually performing data modeling, creating attributes, selecting defaults, and performing other functions. Designer module 730 may be used to further modify or customize the data models and/or contents list that are generated by catalog builder module 720. Designer module 730 may also be used to publish the result of the data modeling.

[0136] Publisher module 740 receives the data models and the Contents List table from database 750 and generates UI elements suitable for display on a screen. In an embodiment, publisher module 740 generates a Contents List page based on the Contents List table and a number of pagesets based on the data models. The Contents List page and pagesets may be presented as HTML files or in some other format.

Computer System

[0137] FIG. 8 is a block diagram of an embodiment of a computer system 800 that may be used to store and execute program codes that implement system 200 or 700. System 800 includes a bus 808 that interconnects major subsystems such as one or more processors 810, a memory subsystem 812, a data storage subsystem 814, an input device interface 816, an output device interface 818, and a network interface 820. Processor(s) 810 perform many of the processing functions for system 800 and communicate with a number of peripheral devices via bus 808.

[0138] Memory subsystem 812 may include a RAM 832 and a ROM 834 used to store codes and data that implement various aspects of the invention. In a distributed environment, the program codes and data may be stored on a number of computer systems and used by the processors of these systems. Data storage subsystem 814 provides non-volatile storage for program codes and data, and may include a hard disk drive 842, a floppy disk drive 844, and other storage devices 846 such as a CD-ROM drive, an optical drive, and a removable media drive.

[0139] Input device interface 816 provides an interface with various input devices such as a keyboard 852, a pointing device 854 (e.g., a mouse, a trackball, a touch pad, a graphics tablet, a scanner, or a touch screen), and other input device(s) 856. Output device interface 818 provides an interface with various output devices such as a display 862 (e.g., a CRT or an LCD) and other output device(s) 864. Network interface 820 provides an interface for system 800 to communicate with other computers coupled to communication network 822.

[0140] Many other devices or subsystems (not shown) may also be coupled to system 800. In addition, it is not necessary for all of the devices shown in FIG. 8 to be present to practice the invention. Furthermore, the devices and subsystems may be interconnected in configurations different from that shown in FIG. 8. One or more of the storage devices may be located at remote locations and coupled to system 800 via communication network 822. The operation of a computer system such as that shown in FIG. 8 is readily known in the art and not described in detail herein. The source codes to implement various aspects and embodiments of the invention (e.g., sub-configuration) may be operatively disposed in memory subsystem 812 or stored on storage media such as a hard disk, a floppy disk, or a CD-ROM that is operative with a CD-ROM player.

[0141] Headings are provided herein for reference and to aid in locating certain sections. These headings are not intended to limit the scope of the concepts described thereunder, and these concepts may have applicability in other sections throughout the entire specification.

[0142] The foregoing description of the specific embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without the use of the inventive faculty. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein, and as defined by the following claims.

What is claimed is:
 1. A computer program product for generating data models for an item master with a plurality of items, comprising a computer-usable medium having embodied therein computer-readable program codes for classifying the items in the item master into a plurality of pagesets, wherein each item in the item master is associated with a plurality of attributes and each attribute is associated with a respective set of possible values, and wherein each pageset is defined by a unique combination of values for a first set of attributes; determining a second set of attributes for each pageset, wherein the attributes in the second set are used to identify items in the associated pageset; and generating data models for each pageset based in part on the associated second set of attributes.
 2. The computer program product of claim 1, wherein the computer-usable medium is further embodied with computer-readable program codes for receiving a first set of configuration variables, and wherein the first set of attributes is specified in the first set of configuration variables.
 3. The computer program product of claim 1, wherein the computer-usable medium is further embodied with computer-readable program codes for validating data in the item master.
 4. The computer program product of claim 3, wherein the computer-usable medium is further embodied with computer-readable program codes for generating a first log file with errors resulting from the validating.
 5. The computer program product of claim 1, wherein the data models include a set of tables descriptive of the items in the pageset.
 6. The computer program product of claim 5, wherein the data models for each pageset include a plurality of features tables, one feature table for each attribute in the second set of attributes for the pageset, and a configuration table indicative of valid and invalid configurations for the items in the pageset.
 7. The computer program product of claim 6, wherein invalid configurations are represented in part by a plurality of types of exception messages.
 8. The computer program product of claim 7, wherein a first type of exception messages corresponds to attributes in the second set having one value.
 9. The computer program product of claim 7, wherein a second type of exception messages corresponds to a pair of attributes in the second set having mutually exclusive sets of values.
 10. The computer program product of claim 1, wherein the computer-usable medium is further embodied with computer-readable program codes for storing the data models to a repository.
 11. The computer program product of claim 1, wherein the computer-usable medium is further embodied with computer-readable program codes for generating output files for the plurality of pagesets based on the generated data models.
 12. The computer program product of claim 11, wherein the output files include a plurality of pageset files, one for each of the plurality of pagesets, each pageset file including data for the associated pageset, and a contents list file including data non-specific to the pagesets.
 13. The computer program product of claim 11, wherein the output files are provided as XML documents.
 14. The computer program product of claim 11, wherein the output files are provided as HTML files.
 15. The computer program product of claim 1, wherein the second set of attributes for each pageset includes a sufficient number of attributes such that the items in the pageset are uniquely identified by their values for the attributes included in the second set.
 16. The computer program product of claim 1, wherein the second set of attributes for each pageset includes a minimum number of attributes such that the items in the pageset are uniquely identified by their values for the attributes included in the second set.
 17. The computer program product of claim 1, wherein the attributes in the second set are selected from among a plurality of candidate attributes.
 18. The computer program product of claim 17, wherein the plurality of candidate attributes include one or more mandatory attributes designated to be included in the second set.
 19. The computer program product of claim 17, wherein the plurality of candidate attributes include one or more optional attributes selectable for inclusion in the second set.
 20. The computer program product of claim 19, wherein the one or more optional attributes are provided in an ordered list, and each optional attribute is considered for inclusion in the second set based on its order in the ordered list.
 21. A computer program product for forming a list of attributes for identifying a plurality of items in a pageset, comprising a computer-usable medium having embodied therein computer-readable program codes for receiving the pageset of items, wherein each item is defined by a unique combination of values for a plurality of attributes and each attribute is associated with a respective set of possible values; selecting an attribute not yet considered for identifying items in the pageset; determining whether the selected attribute is useful for identifying items in the pageset; including the selected attribute in the list if the attribute is useful for identifying items in the pageset; and repeating the selecting, determining, and including for each of one or more additional attributes until a sufficient number of attributes is included in the list such that the items in the pageset are uniquely identified by their values for the attributes in the list.
 22. The computer program product of claim 21, wherein attributes to be considered for identifying items are provided in an ordered list, and wherein the attributes in the ordered list are selected for consideration based on their order in the ordered list.
 23. The computer program product of claim 21, wherein attributes to be considered for identifying items are attributes common for all items.
 24. The computer program product of claim 23, wherein each item in the pageset includes a combination of valid values for the common attributes.
 25. The computer program product of claim 21, wherein the selected attribute is deemed as useful for identifying items if more items in the pageset may be uniquely identified with the selected attribute.
 26. A computer program product for forming a list of attributes for identifying a plurality of items in a pageset, comprising a computer-usable medium having embodied therein computer-readable program codes for receiving the pageset of items, wherein each item is defined by a unique combination of values for a plurality of attributes and each attribute is associated with a respective set of possible values; selecting an attribute not yet considered for identifying items in the pageset; including the selected attribute in the list; determining whether the list of attributes is sufficient to uniquely identify the items in the pageset; and repeating the selecting, including, and determining for each of one or more additional attributes until a sufficient number of attributes is included in the list such that the items in the pageset are uniquely identified by their values for the attributes in the list.
 27. The computer program product of claim 26, wherein the computer-usable medium is further embodied with computer-readable program codes for determining whether the selected attribute is useful for identifying items; and removing the selected attribute from the list if the attribute is not useful for identifying items.
 28. In a computer system, a method for generating data models for an item master with a plurality of items, comprising: classifying the items in the item master into a plurality of pagesets, wherein each item in the item master is associated with a plurality of attributes and each attribute is associated with a respective set of possible values, and wherein each pageset is defined by a unique combination of values for a first set of attributes; determining a second set of attributes for each pageset, wherein the attributes in the second set are used to identify items in the associated pageset; and generating data models for each pageset based in part on the associated second set of attributes, wherein the data models include a set of tables descriptive of items in the pageset.